7,921 research outputs found

Learning to Play Othello with N-Tuple Systems

Author: Lucas Simon M
Publication date: 01/01/2008

This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best performing of these after just five hundred games of self-play learning. The conclusion is that n-tuple networks learn faster and better than the other, more conventional approaches.

University of Essex Research Repository

CiteSeerX

Temporal difference learning with interpolated table value functions

Author: Lucas Simon M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 14/10/2009

This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance. © 2009 IEEE

University of Essex Research Repository

Crossref

Investigating learning rates for evolution and temporal difference learning

Author: Lucas Simon M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2008

Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents a first attempt to place an upper bound on the information rates attainable with standard coevolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for coevolution. Under commonly used settings for learning to play Othello, for example, TDL may have an upper bound that is hundreds or even thousands of times higher than that of coevolution. To test how well these bounds correlate with actual learning rates, a simple two-player game called Treasure Hunt is developed. While the upper bounds cannot be used to predict the number of games required to learn the optimal policy, they do correctly predict the rank order of the number of games required by each algorithm. © 2008 IEEE

University of Essex Research Repository

Crossref

Approximating n-player behavioural strategy nash equilibria using coevolution

Authors: Lucas Simon; Samothrakis Spyridon
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2011

Coevolutionary algorithms are plagued with a set of problems related to intransitivity that make it questionable what the end product of a coevolutionary run can achieve. The introduction of solution concepts into coevolution alleviated part of the issue; however, efficiently representing and achieving game-theoretic solution concepts is still not a trivial task. In this paper we propose a coevolutionary algorithm that approximates behavioural strategy Nash equilibria in n-player zero-sum games by exploiting the minimax solution concept. To support our case, we provide a set of experiments in games of both known and unknown equilibria. In the case of known equilibria, we confirm that our algorithm converges to the known solution, while in the case of unknown equilibria we see steady progress towards a Nash equilibrium. Copyright 2011 ACM

University of Essex Research Repository

Crossref

Forcing neurocontrollers to exploit sensory symmetry through hard-wired modularity in the game of Cellz

Authors: Lucas Simon M.; Togelius Julian
Publication date: 01/01/2005

Several attempts have been made in the past to construct encoding schemes that allow modularity to emerge in evolving systems, but success has been limited. We believe that in order to create successful and scalable encodings for emerging modularity, we first need to explore the benefits of different types of modularity by hard-wiring these into evolvable systems. In this paper we explore different ways of exploiting the sensory symmetry inherent in the agent in the simple game Cellz by evolving symmetrically identical modules. It is concluded that significant increases in both speed of evolution and final fitness can be achieved relative to monolithic controllers. Furthermore, we show that a simple function approximation task that exhibits sensory symmetry can be used as a quick approximate measure of the utility of an encoding scheme for the more complex game-playing task.

CogPrints Cognitive Sciences Eprint Archive

Ms Pac-Man versus Ghost Team CEC 2011 competition

Authors: Lucas Simon M; Rohlfshagen Philipp
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 20/07/2011

Games provide an ideal test bed for computational intelligence, and significant progress has been made in recent years, most notably in games such as Go, where the level of play is now competitive with expert human play on smaller boards. Recently, a significantly more complex class of games has received increasing attention: real-time video games. These games pose many new challenges, including strict time constraints, simultaneous moves and open-endedness. Unlike in traditional board games, computational play is generally unable to compete with human players. One driving force in improving the overall performance of artificial intelligence players is game competitions, where practitioners may evaluate and compare their methods against those submitted by others and possibly against human players as well. In this paper we introduce a new competition based on the popular arcade video game Ms Pac-Man: Ms Pac-Man versus Ghost Team. The competition, to be held at the Congress on Evolutionary Computation 2011 for the first time, allows participants to develop controllers for either the Ms Pac-Man agent or the Ghost Team. Unlike previous Ms Pac-Man competitions, which relied on screen capture, the players now interface directly with the game engine. The paper also reviews previous work and discusses several aspects of setting up the game competition itself. © 2011 IEEE

University of Essex Research Repository

Crossref

Evolving controllers for simulated car racing

Authors: Lucas Simon M.; Togelius Julian
Publication venue: IEEE Press
Publication date: 01/01/2005

This paper describes the evolution of controllers for racing a simulated radio-controlled car around a track modelled on a real physical track. Five different controller architectures were compared, based on neural networks, force fields and action sequences. The controllers use either egocentric (first person) information, Newtonian (third person) information, or no information about the state of the car (open-loop controller). The only controller that is able to evolve good racing behaviour is based on a neural network acting on egocentric inputs.

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Arms races and car races

Authors: Lucas Simon M.; Togelius Julian
Publication venue: Springer
Publication date: 01/01/2006

Evolutionary car racing (ECR) is extended to the case of two cars racing on the same track. A sensor representation is devised, and various methods of evolving car controllers for competitive racing are explored. ECR can be combined with coevolution in a wide variety of ways, and one aspect explored here is the relative-absolute fitness continuum. Systematic behavioural differences are found along this continuum; further, a tendency to specialization and the reactive nature of the controller architecture are found to limit evolutionary progress.

CogPrints Cognitive Sciences Eprint Archive

Evolving robust and specialized car racing skills

Authors: Lucas Simon M.; Togelius Julian
Publication venue: IEEE Press
Publication date: 01/01/2006

Neural network-based controllers are evolved for racing simulated R/C cars around several tracks of varying difficulty. The transferability of driving skills acquired when evolving for a single track is evaluated, and different ways of evolving controllers able to perform well on many different tracks are investigated. It is further shown that such generally proficient controllers can reliably be developed into specialized controllers for individual tracks. Evolution of sensor parameters together with network weights is shown to lead to higher final fitness, but only if turned on after a general controller is developed; otherwise it hinders evolution. It is argued that simulated car racing is a scalable and relevant testbed for evolutionary robotics research, and that the results of this research can be useful for commercial computer games.

CogPrints Cognitive Sciences Eprint Archive

Fast Approximate Max-n Monte Carlo Tree Search for Ms Pac-Man

Authors: Lucas Simon; Robles David; Samothrakis Spyridon
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/04/2011

We present an application of Monte Carlo tree search (MCTS) to the game of Ms Pac-Man. Contrary to most applications of MCTS to date, Ms Pac-Man requires almost real-time decision making and does not have a natural end state. We approached the problem by performing Monte Carlo tree searches on a five-player max-n tree representation of the game with limited tree search depth. We performed a number of experiments using both the MCTS game agents (for Ms Pac-Man and the ghosts) and agents used in previous work (for the ghosts). Our approach achieves excellent scores, outperforming previous non-MCTS opponent approaches to the game by up to two orders of magnitude. © 2011 IEEE

University of Essex Research Repository

Crossref